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Abstract 

Background: Calcium imaging in insects reveals the neural response to odours, both at the receptor level on the 
antenna and in the antennal lobe, the first stage of olfactory information processing in the brain. Changes of 
intracellular calcium concentration in response to odour presentations can be observed by employing calcium- 
sensitive, fluorescent dyes. The response pattern across all recorded units is characteristic for the odour. 

Method: Previously, extraction of odour response patterns from calcium imaging movies was performed offline, 
after the experiment. We developed software to extract and to visualise odour response patterns in real time. An 
adaptive algorithm in combination with an implementation for the graphics processing unit enables fast 
processing of movie streams. Relying on correlations between pixels in the temporal domain, the calcium imaging 
movie can be segmented into regions that correspond to the neural units. 

Results: We applied our software to calcium imaging data recorded from the antennal lobe of the honeybee Apis 
mellifera and from the antenna of the fruit fly Drosophila melanogaster. Evaluation on reference data showed results 
comparable to those obtained by previous offline methods while computation time was significantly lower. 
Demonstrating practical applicability, we employed the software in a real-time experiment, performing 
segmentation of glomeruli - the functional units of the honeybee antennal lobe - and visualisation of glomerular 
activity patterns. 

Conclusions: Real-time visualisation of odour response patterns expands the experimental repertoire targeted at 
understanding information processing in the honeybee antennal lobe. In interactive experiments, glomeruli can be 
selected for manipulation based on their present or past activity, or based on their anatomical position. Apart from 
supporting neurobiology, the software allows for utilising the insect antenna as a chemosensor, e.g. to detect or to 
classify odours. 



Introduction 

Motivation 

Odours take many shapes, and equipped with an insect 
brain and a neuroimaging device one can reveal these 
shapes, turning chemicals into patterns and images. 

In the conference version of this paper [1], we have 
introduced an imaging system that can read out and 
process brain activity in real time, making the neural 
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representations of odours accessible. The biological moti- 
vation is that access to ongoing brain activity is the basis 
for analysing storage and processing of information in 
the brain, observing, for example, the activity patterns in 
response to odour stimulation. In particular, transform- 
ing odours into patterns and images does not only benefit 
basic neuroscientific research: It also allows us to utilise a 
living organism with highly sensitive olfactory organs as a 
chemosensor, where the patterns and the distances 
between them contain information about odour identity 
and dissimilarity. 
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We consider two application scenarios for real-time 
visualisation of odours using insect brains. In insects, the 
first stage of odour perception is formed by the odour 
receptor neurons on the antenna. Here, we utilise cal- 
cium imaging to record from the antenna of the fruit fly 
Drosophila melanogaster. While such data provides only 
little information about signal processing in the brain, 
receptor neurons on the antenna are easy to access 
experimentally and they are excellent chemosensing 
devices. As such they are a promising alternative to artifi- 
cial chemosensors, also referred to as electronic noses 
(see e.g. [2-5]), that find application in environmental 
monitoring, chemical industry or security. 

The second stage of odour processing in the insect 
brain is the antennal lobe (AL), a dedicated olfactory cen- 
ter where odours are represented by activity patterns of 
neural units, the so-called glomeruli [6]. A network of 
interneurons connects the glomeruli, and unravelling the 
function of this network in processing odour information 
is the topic of ongoing research. The honeybee AL is an 
established model for studying odour learning and mem- 
ory [7], and neuropharmacological tools [8,9] have been 
developed to manipulate the network of interneurons. 
Here, the real-time aspect of odour visualisation is espe- 
cially relevant as decisions can be based on prior activity, 
targeting e.g. glomeruli that have previously been part of 
a response pattern. 

From an image processing perspective, both application 
cases are similar. Activity patterns in the honeybee AL 
[10], and on the Drosophila antenna [11] are accessible 
through calcium imaging with fluorescent calcium repor- 
ters. Calcium imaging movies report fluorescence changes 
over time. Figure 1 gives an example for an imaging movie 
recorded from the honeybee AL, showing both the noisy 
raw images and processed, denoised versions of these 
images displayed in a false-colour scale. 



The algorithm presented in this work computes a low- 
rank approximation based on few, selected (pixel) time 
series from the movie matrix. Exploiting the fact that there 
is noise and redundancy (time series from the same glo- 
merulus are correlated) in the data, the movie matrix can 
be represented by another matrix of much lower rank. The 
rank-reduced version contains less noise, revealing the 
positions of the glomeruli and their signals [12]. As the 
glomerulus shapes become visible in Figure 1, such a repre- 
sentation may also be referred to as a segmentation, where 
the movie is segmented into regions with correlated activity 
over time, regions that correspond to the glomeruli. 

Prior approaches to processing such imaging data involve 
manual feature selection [13,14], or, if automated, they can 
only be performed offline [12,15]. The real-time imaging 
system presented in this work allows for a wider range of 
applications, including closed-loop experiments, and it 
is what makes the readout from the antenna practical 
for chemosensing applications. Computation is performed 
in an adaptive manner, processing the movie stream 
incrementally as the images arrive from the camera. Fast 
processing is ensured by a GPGPU (General Purpose Com- 
putation on Graphics Processing Unit) implementation. 
The following sections provide further biological back- 
ground and report on prior work. We then present algo- 
rithms and implementation details (Methods), followed by 
evaluation and demonstration of biological application 
cases (Results and discussion): Using the real-time imaging 
system, we visualise spontaneous activity and odour 
responses in the honeybee AL, and we provide a proof of 
concept for practical chemosensing with a biological sensor. 

Biological background 
The olfactory system 

The structure of olfactory systems is similar between 
insect species. As an example, we provide numbers for the 
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Figure 1 Odour response patterns in the honeybee antennal lobe, a) Anatomical model of the honeybee AL (modified from [55]). 
Landmark glomeruli (17, 33, 42) are labelled according to the numbering from [50], b) Frontal view on the AL by calcium imaging. Above: Raw 
images (ratio: 340/380) from the movie. Below: Images processed with the real-time software. Glomeruli exhibit spontaneous background activity, 
or they respond to the odour during stimulus presentation (black bar). 



Strauch et al. BMC Bioinformatics 2013, 14(Suppl 19):S6 
http://www.biomedcentral.eom/1 471 -2 1 05/1 4/S1 9/S6 



Page 3 of 15 



honeybee Apis mellifera. On the antenna, approximately 
60,000 odour receptor neurons interact physically with the 
odour molecules. These 60,000 neurons converge onto 160 
glomeruli in the AL, where each glomerulus collects input 
from receptor neurons of one type. The glomeruli are bun- 
dles of synapses and appear as spherical structures with a 
diameter between 30 and 50 ftm. At the AL stage, each 
odour is represented by an activity pattern across the 160 
glomeruli. Further upstream, this compact representation 
is widened again, as the projection neurons project from 
the glomeruli to approximately 160,000 Kenyon cells, a 
stage where odours are represented by sparser, high- 
dimensional patterns. [16,17] While implementation details 
differ between species, the combinatorial coding of odours 
by activity patterns across glomeruli, in the insect AL or in 
the vertebrate olfactory bulb, is a common feature of olfac- 
tory systems and can also be found in humans [18]. 
Odour responses in the AL 

Calcium imaging using calcium-sensitive fluorescent dyes 
grants us access to the odour response patterns in the AL 
of the honeybee Apis mellifera [10]. These odour response 
patterns reflect the response properties of the odour 
receptor neurons, as well as additional processing that 
takes place in the AL. Interneurons connecting the glo- 
meruli perform further computations such as contrast 
enhancement [10]. 

For an example of odour response patterns in the hon- 
eybee AL, see Figure 1. The activity pattern of glomeruli 
(between ca. 20 and 40 of the glomeruli are visible in an 
imaging movie) fluctuates at a low amplitude when no 
odour is present. After stimulation with an odour (indi- 
cated by the black bar), glomeruli exhibit individual 
responses to the odour. As a result, the activity pattern 
across glomeruli changes in way that is characteristic for 
the odour. The same odour elicits a similar pattern in dif- 
ferent bees [6]. 

There is evidence that not only the identity of a parti- 
cular odour is encoded by the corresponding glomerular 
response pattern, but that also chemical [19] and per- 
ceptual [20] (dis)similarity are reflected by the (dissimi- 
larity of response patterns, suggesting that response 
pattern space is a rather faithful representation of che- 
mical space. 

Odour responses on the antenna 

Glomerular response patterns, as measured in the honeybee 
recordings from this work, are the output signal of the AL, 
i.e. they are the result of integrating all receptor neurons of 
one type and of further processing that occurs in the AL 
network of interneurons. While odour coding is improved 
after this processing [21], the results of this paper suggest 
that chemical identity and (dis)similarity can already 
be inferred from receptor neuron signals recorded on the 
antenna, the earliest stage in the olfactory processing 



pipeline where response patterns are easily accessible with- 
out dissecting the brain. 

In this work, antenna data was recorded in the fruit fly 
Drosophila melanogaster. The genetic tools that are avail- 
able for Drosophila make it possible to express a calcium 
reporter directly in the receptor neurons on the antenna. 
Instead of expressing the reporter in cells of one type, as 
has been done before [11], the approach pursued here is 
to measure signals from a large set of different receptor 
cells that all express the general olfactory co-receptor 
Oreo that these cells bear in addition to a specific odour 
receptor. This allows us to measure broad odour 
response patterns across many receptors. The segmenta- 
tion approach presented in this paper is then used to to 
extract individual response units from the imaging movie 
based on their differential responses to a series of 32 dif- 
ferent odours. 

Related work 

Computational approaches to analysing imaging data can 
be classified as being either synthetic or analytic. In a 
synthetic approach, similar to common procedures for 
analysing fMRI data, Stetter et al. [22] have set up non- 
linear functions that they fitted to the individual (pixel) 
time series of the imaging movie. These functions can 
account e.g. for dye bleaching over time and for different 
neural signal components. Rather than performing bot- 
tom-up synthetic reconstruction of the imaging movie, 
analytic approaches decompose (bottom-down) the 
movie into factors. These are matrix factorisation or 
decomposition methods that exist in many different fla- 
vours, e.g. the well-known Principal Component Analysis 
(PCA). In particular Independent Component Analysis 
(ICA) has found widespread application on imaging data 
[15,23-26]. While ICA can be seen as a matrix decompo- 
sition method, the motivating paradigm for ICA is source 
separation. Under the assumption that there are underly- 
ing source signals that are statistically independent (and 
non-Gaussian), ICA algorithms (e.g. [27]) aim at recover- 
ing or separating these source signals on sample data 
where the sources appear in mixed form, e.g. neural sig- 
nals mixed with measurement artifacts. 

A recent convex analysis approach [12] (s.a. Methods), 
performs a factorisation of the movie matrix based on 
extremal column vectors from the boundary of the convex/ 
conical hull of the data. Under the assumption that pure 
signal sources are present in the data, finding the extremal 
column vectors identifies these pure signal sources. 

Traditionally, calcium imaging data from the insect AL 
has been processed by semi-automatic methods that per- 
form e.g. image smoothing, but that still require human 
interaction to select regions of interest [13,14]. From the 
methods listed above, those that require human interaction 



Strauch ef al. BMC Bioinformatics 2013, 14(Suppl 19):S6 
http://www.biomedcentral.eom/1 471 -2 1 05/1 4/S1 9/S6 



Page 4 of 1 5 



appear less suited for real-time processing on a movie 
stream. So far, no real-time implementations of the com- 
putational approaches exist, the software implementations 
from [12,15] being only suited for offline data analysis. 

Methods 

Biological methods 
Imaging the honeybee AL 

For honeybees, it has been shown that projection neu- 
ron firing rate correlates with changes in intracellular 
calcium [28]. Staining with calcium-sensitive fluorescent 
dyes and excitation of the dyes with UV-light thus leads 
to a good proxy signal for brain activity [29]. 

Calcium imaging with forager honeyebees (Apis melli- 
fera) was performed as described in [30]. Projection 
neurons in the 1-APT and m-APT (lateral/ medial antenno- 
protocerebral tract) were stained with Fura2-dextran 
(Invitrogen, Molecular Probes, Eugene, OR, USA), a cal- 
cium-sensitive, fluorescent dye. Activity of the projection 
neurons, that depart from the glomeruli in the AL, could 
be recorded using the experimental setup displayed in 
Figure 2. A fluorescence microscope (Axio Imager D.l, 
Zeiss, Gottingen, Germany) was equipped with a water 
immersion objective (20 x, NA 0.95, Olympus, Tokyo, 
Japan). A light source (Polychrome V, TILL Photonics, 
Grafelfing, Germany) provided excitation light at 340 and 
380 nm, and fluorescence was recorded with a CCD 



camera (Andor Clara, Andor Technology PLC, Belfast, 
Northern Ireland). The input signal for data processing 
was computed as the ratio between consecutive images 
recorded at 340 and 380 nm, a standard procedure for 
Fura2-dextran [31]. 
Imaging the Drosophila antenna 

Animals Animals used for the experiments were female 
Drosophila melanogaster that were reared at 25 °C in a 12/ 
12 light/dark cycle. Flies were of genotype w; P[Orco:Gal4]; 
P[UAS:GCaMP3]attP40, expressing the calcium reporter 
G-CaMP3 [32,33] in all Oreo (olfactory co-receptor) bear- 
ing cells (UAS-GCaMP3 flies were provided by Loren 
L. Looger, Howard Hughes Medical Institute, Janelia Farm 
Research Campus, Ashburn, Virginia, USA). 
Odorant preparation Odorants were purchased from 
Sigma- Aldrich in the highest purity available. Pure sub- 
stances were diluted in 5 mL mineral oil (Sigma- Aldrich, 
Steinheim, Germany) to a concentration of 10' vol/vol. 
Odours were prepared in 20 mL headspace vials, covered 
with nitrogen and sealed with a Teflon septum (Axel 
Semrau, Germany). Odorants used were: 2-propylphenol 
(644-35-9), alpha-ionone (127-41-3), alpha-bisabolol 
(23089-26-1), trans-caryophyllene (87-44-5), (R)-carvone 
(6485-40-1), (S)-carvone (2244-16-8), beta-citronellol 
(106-22-9), 4-allyl-l,2-dimethoxybenzene (93-15-2), ethyl 
3-hydroxyhexanoate (2305-25-1), ethyl (R)-3-hydroxybu- 
tanoate (24915-95-5), eugenol (97-53-0), E, E-farnesol 




Figure 2 Experimental setup for honeybee brain imaging. Setup for the honeybee imaging experiments. After excitation with light at 
wavelengths 340 nm and 380 nm (light source not shown), fluorescence is recorded by a CCD camera mounted on top of a confocal 
microscope. Temperature is controlled by a heat lamp. The photograph shows the setup before the experiment, which is carried out in the dark. 
During the experiment, camera signals are processed in real time and results are displayed on the visualisation screen. Odour stimuli can be 
applied with a syringe (not shown), before being sucked out through the air exhaust. 
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(106-28-5), geraniol (106-24-1), heptyl acetate (112-06-1), 
hexyl acetate (142-92-7), hexyl butyrate (2639-63-6), 
isoamyl tiglate (41519-18-0), iso-eugenol (97-54-1), 
4-isopropylbenzaldehyde (122-03-2), linalool (78-70-6), 
methyl 3-hydroxy hexanoate (21188-58-9), 4-methoxy- 
benzaldehyde (123-11-5), methyl jasmonate (39924-52-2), 
(lR)-myrtenal (564-94-3), nonanal (124-19-6), nonanone 
(821-55-6), octyl acetate (112-14-1), phenylacetaldehyde 
(122-78-1), 4-hydroxy-3-methoxybenzaldehyde (121-33-5), 
gamma-propyl-gamma-butyrolactone (105-21-5), alpha- 
terpineol (10482-56-1) and alpha-thujone (546-80-5). 
Stimulus application A computer-controlled autosam- 
pler (PAL, CTC Switzerland) was used for automatic 
odour application. 2 mL of headspace was injected in two 
1 mL portions at timepoints 6 s and 9 s with an injection 
speed of 1 mL/s into a continuous flow of purified air 
flowing at 60 mL/min. The stimulus was directed to the 
antenna of the animal via a Teflon tube (inner diameter 
1 mm, length 38 cm). 

The interstimulus interval was approximately 2 min. 
Solvent control and reference odorants (heptyl acetate 
and nonanone) were measured after every five stimuli 
(one block). The autosampler syringe was flushed with 
purified air for 30 s after each injection and washed with 
pentane (Merck, Darmstadt, Germany) automatically 
after each block of stimuli. 

Calcium imaging Calcium imaging was performed with a 
fluorescence microscope (BX51WI, Olympus, Tokyo, 
Japan) equipped with a 50x air lens (Olympus LM Plan FI 
50x/0.5). A CCD camera (TILL Imago, TILL Photonics, 
Grafelfing, Germany) was mounted on the microscope, 
recording with 4x4 pixel on-chip binning, resulting in 
160x120 pixel sized images. For each stimulus, recordings 
of 20 s at a rate of 4 Hz were performed using TILL Vision 
(TILL Photonics, Grafelfing, Germany). 

A monochromator (Polychrome II, TILL Photonics, 
Grafelfing, Germany) produced excitation light of 470 nm 
wavelength which was directed onto the antenna via a 
500 nm low-pass filter and a 495 nm dichroic mirror. 
Emission light was filtered through a 505 nm high-pass 
emission filter. 

Flies were mounted in custom-made holders, placed 
with their neck into a slit. The head was fixed to the 
holder with a drop of low-melting wax. A half electron 
microscopy grid was placed on top of the head, stabilising 
the antenna by touching the 2nd, but not the 3rd antennal 
segment. 

Matrix factorisation framework 

We first describe the general matrix factorisation frame- 
work for imaging movies. The framework is illustrated in 
Figure 3. An imaging movie can be cast into matrix form 
by flatting the two-dimensional images with n pixels into 
row vectors of length n. The movie matrix A m * " has 



m time points and n pixels. The rows of the movie matrix, 
A(i), contain images or time points. The columns, A^'\ 
contain pixels or time series. 

We consider a factorisation of A into a matrix T m x k of 
k time series and a matrix S * " of k images, where 
k <£n,m. This provides a low-rank approximation A k to 
the original matrix A: 

k 

A mxn : A fe = r mxfe S fex " = T 'r s n W 

r=l 

In imaging movies, all pixels that report the signal of the 
same glomerulus are correlated with each other (apart 
from the noise), which causes redundancy in A. It is thus 
possible to construct a good approximation with small k, 
such that || A — TS || is small. 

The optimal rank-£ approximation with respect to the 
aforementioned norm difference can be computed with 
Principal Component Analysis (PCA) [34]. However, the 
images in S computed by PCA are not sparse, with 
almost all pixels being different from zero [12]. The 
images in 5, and the corresponding time series in T, can 
thus hardly be interpreted as the boundaries or the signal, 
respectively, of a particular neural unit. By definition, 
principal components need to be orthogonal to each 
other, which often prevents them from closely fitting the 
underlying source signals. 

Ideally, as in the example from Figure 3, the images in 
5 should be sparse, with only few pixels being different 
from zero. The k time series in T should be selected from 
k different glomeruli with the corresponding rows in the 
sparse S marking positions and boundaries of the glomer- 
uli. We have shown in [12] that there is a method, the 
convex cone algorithm, that can achieve a factorisation 
with such favourable properties on imaging data. 

Convex cone algorithm 

In this section, we review the convex cone algorithm from 
[12]. It is based on a non-negative mixture model for ima- 
ging data: 

A = TS 0+ + N (2) 

The movie matrix A can be described by basis time ser- 
ies in T that are combined by coefficients in the non- 
negative matrix S° + . Residual noise is accounted for by N. 

We assume that the columns A w of the movie matrix 
contain either pure glomerulus signals or mixed signals, i. 
e. linear combinations (with non-negative coefficients) of 
the pure signals. At the fringes of a glomerulus, close to 
the neighbour glomeruli, such mixed signals can occur 
when a glomerulus signal is contaminated with additive 
light scatter from one or more neighbour glomeruli. Even 
if a glomerulus does not respond to an odour, light scatter 
can give the impression of a signal. In the middle of the 
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Figure 3 Illustration of the matrix factorisation framework. Matrix factorisation framework for imaging movies: The movie in matrix A is 
approximated by the product of the k time series in 7" and the k images in S, forming the rank-/c matrix A k . 



glomeruli, that are rather large, circular objects, light scat- 
ter from the (distant) pixels of the neighbour glomeruli is 
less likely, and we assume that here the pixels contain 
pure signals. 

For the matrix factorisation framework from Figure 3, 
we would like to select one pure signal from each glomer- 
ulus into T. Mixtures can then be modelled by 5° + . While 
S° + can be computed easily given A and T, the challenging 
part is the selection of time series from the glomeruli 
into T. 

Geometrically, the columns in T span a convex cone 
[35,36] that contains a part of the data points in A. Data 
points that lie within the cone can be reconstructed 
exactly by linear combination (with non-negative coeffi- 
cients) of the columns in T . Data points that lie outside of 
the cone can be approximated by projecting them to the 
boundary of the cone, where the approximation error 
depends on the distance to the boundary. 

From convex analysis we know that the set of extreme 
vectors of A is the minimal generator of the convex cone 
that contains the entire A [35,36]. With the extreme vec- 
tors we can span a volume that contains all data points of 
A and that thereby reduces the approximation error to 
zero. For imaging movies, the extreme columns vectors 
are also the columns with the pure signals from the mid- 
dle of the glomeruli, whereas the mixed signal columns, 
that can be combined from the extreme, pure signal col- 
umns, lie within the cone. 

Following this motivation, the convex cone algorithm 
[12] makes locally optimal choices for the next extreme 
column vector. With each new vector selected by the 
algorithm, the columns in T span a larger (>) volume. 

The convex cone algorithm starts with matrix Am := A, 
selecting the column with index p that has the largest 

Euclidean norm: argmaxp||A[jj|| This column becomes 

the first column of T, jW := A$L Then, a matching 

S(i) := AjjjT^ 1 ' is computed (for simplicity we omit the 



non-negativity constraint on S). The movie matrix is 
downdated as A{2j := A(ij — T^Sm. In the new matrix A 
(2} at iteration 2, the influence of the first column T^ 1 ' is 
removed. We then select the column that is farthest away 
from the boundary of the cone, i.e. the column with the 
largest norm in Aj 2 j. This is an estimate for the next 
extreme column vector, and we fill this column into I^ 2 \ 

We repeat the process until c columns are selected. In 
the following, we reserve c for the (user-specified) num- 
ber of columns selected by the convex cone algorithm, 
and k for the number of principal components in the 
PCA step that is performed before the convex cone 
algorithm. 

Working on a movie stream 

There are two motivations for performing PCA as a pre- 
processing prior to the convex cone algorithm. First, keep- 
ing only the top-/c principal components reduces noise, 
which can make selection of extreme columns more 
robust. Second, we can utilise PCA to reduce computation 
time. For the real-time application, the movie matrix A 
grows by one row at each time point. The complexity of 
the convex cone algorithm is in the order 0(mnc) if run 
once on the complete matrix A. For a growing movie 
matrix this would quickly accumulate a large overhead, 
the cost of performing the convex cone algorithm at each 
time point being 0(lnc + 2nc + . . . + mnc). 

If we utilise PCA to keep, at all times, a compact sum- 
mary matrix of constant size, we can remove the depen- 
dency on the growing time dimension. We propose to 
use an incremental PCA (IPCA) approach that computes 
the matrix V k of the top-/c principal components at each 
time point, where V k is updated at low cost given the old 
version of V k and the current image received from the 
movie stream. The convex cone algorithm is then no 
longer performed directly on A, but on V k . As V k is the 
minimiser of ||A — Vfe||, moderate values for k are suffi- 
cient in practice. 
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Several publications have treated IPCA algorithms 
[37-42]. Here, we rely on the CCIPCA algorithm by Weng 
et al. [39]. Several successful applications of CCIPCA can 
be demonstrated [43-45]. In these cases, CCIPCA was also 
used to incrementalise another algorithm by providing an 
updated version of matrix V k at each time point. CCIPCA 
costs a constant 0(nk) operations per update, which 
amounts to 0(mnk) for processing the entire movie once. 

Here, we outline the basic principle behind CCIPCA. 
The first principal component is approximated as the 
mean of the images received so far. The second principal 
component is approximated as the mean of the images 
from which the projection onto the first PC has been 
subtracted, etc. This approach allows for incremental 
updates, and it completely avoids the time-consuming 
construction of a large n x n covariance matrix, which 
would be required by standard PCA approaches that 
compute the eigenvectors of the covariance matrix. In 
the following, we briefly outline the CCIPCA iteration. 
For further details on CCIPCA, see [39]. A convergence 
proof is given in [46]. 

We assume that the movie matrix grows by one image, 
A(i), at time point i. The r =1, k rows of the principal 
component matrix V = V* are initialised with k arbitrary, 
orthogonal vectors. Then, V is upated at each time point 
using the current image A^, where principal component 
V( r ) at time point i is denoted V^: 
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Then, image A w is downdated by subtracting the projec- 
tion onto V^. 
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After updating the rth principal component in this 
way, we can return to Equation (3) to update V (r+1) 
(Algorithm 1). 

Algorithm 1: V i] = UpdateJPCA (V {i_1 U«), k , i) 

for all r e [0, k - 1] do 



v; 



i-i} 



A (i ) := A W 



end for 



A T 



v: 



(r) 



V, 



V, 



(0 



V, 



Cone_updating: Visualisation in real time 

Combining CCIPCA (Algorithm 1) and the convex cone 
algorithm leads to the algorithm at the core of the real- 
time imaging system: Cone_updating (Algorithm 2). Each 
image A(i) at time point i is first preprocessed by pixel- 
wise z-score normalisation: Subtract fi, the mean, and 
divide by a, the standard deviation. Both, p and a of a 
pixel, can be updated as the movie stream proceeds. 

After normalisation, the matrix V of the top-/: principal 
components is updated with the current image: V* 1 ' := 
UpdateJPCACV 11 " 1 ', A{i), k, i). Finally, the Convex_cone_ 
algorithm (V* 1 ', c) is applied to select c pixels (columns) 
from the current version of V. 

As the movie matrix A grows, the incremental estimates 
for n, a and V improve. As a consequence, the c columns 
selected by the convex cone algorithm are better estimates 
of the extreme column vectors of A, the vectors that con- 
tain the pure glomerulus signals. In the matrix factorisa- 
tion framework, these are are the columns for matrix T, 
and the corresponding S indicates glomerulus position 
(Figure 3). 

Visualisations of brain activity, such as in Figure 1, can 
be achieved by low-rank approximation, using matrices 
T and S: A k = TS. At time point i we do not yet know 
the final T and S, and therefore we obtain the approxi- 
mation A(,) osAr,) := A^S M S {, \ where s 1 ' 1 is the current 
version of S. 

For offline data visualisation, the colour scale can be 
adjusted to the maximum and minimum value of A. For 
real-time display, using one colour scale for the entire 
movie, maximum and minimum have to be updated 
incrementally. To avoid level changes, e.g. by long-term 
photobleaching of the calcium dye, data was high-pass 
filtered (0:025 Hz) before display in a false-colour scale 
(as in Figure 1). 

Algorithm 2:5= Cone_updating (A (m * "\ c, k) 
Initialise 

for all i e [0, m - 1] do 

A(/) := z_score_normalise (A^) 
if i > 1 then 
V®:= Update_IPCA(\/ !i '' 1 U(/), k, i) 

:= Convex_cone_algorithm (V* 1 ', c) 
Ar,-) := Af^S'^S'' 1 // low-rank approximation to 
image A (i) 
end if 
end for 

Implementations 

We consider three implementations of the convex cone 
algorithm. Two implementations were written in Java, 
Java_offline, the reference implementation from [12], and 
Java_online. Java_offline performs exact offline PCA, fol- 
lowed by the convex cone algorithm, whereas Java_online 
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(implementation of Algorithm 1) uses incremental PCA 
instead. Both Java implementations were performed in 
KNIME [47] (http://www.knime.org), a data pipelining 
environment. 

Finally, we implemented the incremental online variant 
(Algorithm 1) using GPGPU: GPGPU_online. Z-score 
normalisation and the time-consuming PCA were imple- 
mented for the GPU with the NVIDIA CUDA [48] Basic 
Linear Algebra Subroutines (cuBLAS) (http://developer. 
nvidia.com/cublas) and the CUDA Linear Algebra library 
(CULA) (http://www.culatools.com/). The actual convex 
cone algorithm was run on the CPU. 

TILL Photonics Live Acquisition (LA) Software 2.0 [49] 
was employed to control experimental hardware and exci- 
tation light intensity. GPGPU_online accessed the movie 
stream directly from the camera using a custom-built soft- 
ware interface kindly provided by TILL Photonics. 

Results and discussion 

This sections starts with a technical evaluation of the pro- 
posed algorithm and a comparison of different implemen- 
tations. We then demonstrate practical applicability in an 
experiment with honeybees and show how the techniques 
developed in this work can be utilised to turn the insect 
antenna into a living chemosensor. We conclude with a 
discussion regarding the impact that real-time processing 
of neural activity can have. 

Performance measures 
Computing time 

The motivations for adapting the matrix factorisation fra- 
mework to the datastream domain were the ability to per- 
form incremental updates upon arrival of new data, and of 
course the ability to process data with minimal time delay. 
For evaluation, we performed computation time measure- 
ments on a reference dataset. Measurements were carried 
out using an Intel Core i7 950 (3.07 GHz) CPU and a 
NVIDIA. 

GeForce GTX 285 (648 MHz, 1024 MB) GPU. The 
Java_offline and Java_online implementations were run in 
KNIME (http://www.knime.org) workflows, and, for com- 
parability with the C-implementation GPGPU_online, 
time measurements do only include the actual computa- 
tion time and not the time for data transfer between 
nodes in the KNIME workflow. 

The dataset consisted of 11 imaging movies of the hon- 
eybee AL (a part of the dataset was shown in [12]) with ~ 
170 x 130 pixels and « 3500 time points each. The average 
length was about 15 minutes per movie. Table 1 reports 
overall computation time (in minutes) for the entire data- 
set and computation time per frame, averaged over all 11 
movies. Both the incremental approximation to PCA and 
the GPGPU implementation contributed to the speedup. 
Java_online, that uses incremental PCA, achieved an 



approximately 1.5-fold speedup over Java offline that is 
based on exact, conventional PCA. GPGPU_online 
achieved an additional 2-fold speedup over Java online by 
using the GPU instead of the CPU. The parallelisation 
abilities of the GPU ensure scalability to future increases 
in data size, i.e. higher speedups are expected for data with 
higher resolution. 

With the fastest implementation, GPGPU_online, a sin- 
gle image from the movie can be processed in 23 ms 
(Table 1), which is sufficient for calcium imaging in hon- 
eybees and Drosophila with typical recording frequencies 
below 20 Hz. 
Approximation quality 

The fastest implementation, GPGPU_online, achieves a 
significant speedup over the offline reference implementa- 
tion. We next evaluated the quality of the results com- 
puted with GPGPU_online. There is an algorithmic 
approximation involved, and GPU computations are per- 
formed with float precision instead of double precision. In 
the online setting, incremental z-score normalisation was 
imperfect whenever there were mean shifts during the 
course of the experiment. How does this affect the quality 
of the results? For visualisation, we constructed maps of 
the glomeruli in the AL by overlaying all images from the 
rows of matrix S, the images that show the positions and 
boundaries of the glomeruli (see Figure 3). Such glom- 
erular maps reveal the anatomy of the AL and can be 
matched between bees [12]. Using the reference results 
from [12], we compared glomerulus maps computed 
by Java_offline and GPGPU_online on the same movie. 
Parameters were k = 50 principal components, c = 50 
(convex cone algorithm). Figure 4a shows glomerulus 
maps constructed by the two implementations. Clearly, 
both implementations reveal the same anatomy, but there 
is no perfect correspondence between the maps. 

Apart from visual inspection of glomerulus maps, 
where cluster size has a strong impact, we also analysed 
how robust signal (column) selection was against incre- 
mental approximation. The convex cone algorithm 
selects extreme column vectors into matrix T that corre- 
spond to the pure signal sources. Matrix S is computed 
given A and T , and the rows of S reflect the distribution 
of similarity with the corresponding signal sources in T. 



Table 1 Computation time measurements were 
performed on the 1 1 imaging movies of the reference 
dataset. 



Implementation 


(ms/frame) 


overall time consumption (min) 


Java_offline 


134 


68.15 


Java_online 


65 


39.26 


GPGPU_online 


23 


18.31 



We report average computation time per frame and overall computation time 
for the entire dataset. 
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Figure 4 Quality evaluation: Online implementation vs. offline reference implementation a) Comparing results of Java_offline and 
GPGPU_online using glomerulus maps for three bees (data from [12]). Top: Java_offline. Bottom: GPGPU_online. b) Positions of signals (columns) 
selected by the different implementations. Black circles: Positions of all c = 50 columns selected by GPGPU_online. Red triangles: Subset of 
signals (columns) selected by the reference implementation Java_offline. The subset contains only those signals that correspond to glomeruli. 
Glomeruli were identified based on their position in the maps, using the anatomical honeybee AL atlas [50]. 



This gives rise to the clusters of similar pixels in S and on 
the glomerulus map. Preprocessing, such as incremental 
z-score normalisation and incremental PCA, and a post- 
processing step employed in [12] to remove the residual 
noise N (Equation 2) all have an impact on signal similar- 
ity and thus influence cluster size. This affects especially 
clusters that correspond to e.g. areas of similarly strong 
background staining, illumination artifacts etc. that do 
not have such clearly distinct signals as the glomeruli. 

To evaluate signal selection independent of cluster size, 
we visualised the positions of the columns selected by the 
offline reference implementation Java_offline, along with 
the positions of the columns selected by GPGPU_online. 
For the reference implementation, we included only glo- 
merular signals, i.e. those that could be identified by 
matching glomerulus maps to the anatomical honeybee 
AL atlas [50]. Figure 4b shows that the positions of sig- 
nals selected by GPGPU_online (black circles) are in 
good correspondence with the glomerulus "targets" pro- 
vided by Java offline (red triangles). We conclude that 
selection of relevant signals, i.e. glomerular signals, is 
robust against incremental approximation in the online 
setting. 



Documentation of a real-time experiment in the 
honeybee AL 

To demonstrate practical applicability, we performed a 
real-time experiment with honeybees. We used the 
experimental setup from Figure 2 and GPGPU_online, 
the software implementation that proved fastest in the 
evaluation. For a screenshot, see Figure 5. During the 
experiment, three windows were constantly updated: The 
raw fluorescence signal, shown as the ratio between con- 
secutive measurements with 340 nm and 380 nm excita- 
tion light (see Methods), a map of the glomeruli in the 
AL, and the low-rank approximation to the current 
image. Movie documentations of the real-time experi- 
ment (Additional files 1 and 2) are available online. 

The glomerulus map is a segmentation of the image 
plane into regions with correlated activity over time. With 
the growing movie stream, more and more information 
about correlations between pixels becomes available. 
Figure 6 shows the gradual development of a glomerulus 
map during the course of the experiment. While at early 
time points many of the c basis signals were still influ- 
enced by the initialisation of the (not yet converged) incre- 
mental PCA (row of points in the left upper corner), they 
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quickly moved to the positions of the glomeruli as more 
information arrived from the stream. Already after 400- 
600 time points, the map was almost complete. If needed, 
such an incremental computation of the glomerulus map 
could also be used to adapt to changes: For example, shifts 
between images caused by animal movement could be 
corrected for by giving higher weights to more recent time 
points. 

Figure 1 visualised glomerular activity in response to 
odour stimulation. During the real-time experiment, we 
also observed spontaneous activity of glomeruli in the 
absence of odour stimulation, and also this spontaneous 
activity could be visualised by low-rank approximation: 
Figure 7. The ability to detect low-amplitude signals in 
spontaneous activity is relevant for the application scenar- 
ios discussed below (Impact of real-time processing). 

The insect antenna as a chemosensor 
Functional segmentation of the antenna 

While recording glomerular activity in the honeybee AL 
can help to answer neurobiological questions, the experi- 
mental procedure is technically demanding and time- 
consuming as it requires to dissect the brain and to fill in 
a calcium-dye into the projection neurons the day before 
the experiment. Working with the model organism 
Drosophila melanogaster, we could omit the manual 
staining step, genetically expressing the calcium-sensitive 
dye G-CaMP [32,51] in the receptor neurons that are 
accessible without dissecting the brain (see Methods). 



This experimental design allows us to employ a Droso- 
phila antenna as an easy-to-handle chemosensor. Similar 
to the situation in the honeybee AL, stimulating the 
antenna with a series of odour presentations elicits differ- 
ential responses in the receptors that can then be distin- 
guished based on their activity over time. We can thus 
utilise the real-time software to construct a map of the 
Drosophila antenna and to observe odour responses. 

We employed a series of 32 different odour stimulations, 
two of which were control odours that were applied multi- 
ple times. This stimulation protocol elicited sufficiently 
diverse responses to allow for functional segmentation 
(with GPGPU_online). For Figure 8, we computed a map 
of the antenna consisting of c = 100 response units. The 
antenna map is shown in 8a, and Figure 8b contains a sin- 
gle image from the calcium imaging movie. 

In Drosophila, olfactory receptor neurons are anisotropi- 
cally spatially scattered over the surface of the antenna, so 
that odour response patterns form spatio-temporal maps. 
However, sources could be overlapping, creating a situa- 
tion that is more complex than with the image of well seg- 
regated olfactory glomeruli in the antennal lobe. 
Data analysis 

We thus investigated whether the response units had 
indeed biologically or chemically meaningful signals. After 
processing the entire movie stream, we analysed the 
response unit time series in matrix T. For each odour 
stimulation, we computed a feature vector, where the 
feature was the maximum value of the response unit after 



time point 68 126 185 362 656 1478 
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Figure 6 Real-time experiment-Incremental map construction. Development of a glomerulus map during a real-time experiment. The 
images show incremental updates of the map at different time points. Parameters: k = 50 (number of PCs), c = 50. As preprocessing, images 
were spatially filtered (Gaussian kernel, width = 9). 



Strauch ef al. BMC Bioinformatics 2013, 14(Suppl 19):S6 
http://www.biomedcentral.eom/1 471 -2 1 05/1 4/S1 9/S6 



Page 11 of 1 5 



raw data 




•It 



imax 



LA 




mm 
max 



mm 



Figure 7 Real-time experiment-Visualisation of spontaneous background activity Spontaneous background activity of glomeruli in the 
honeybee AL, visualised by low-rank approximation. Top: Fluorescence images recorded at 5 Hz. Each image is the ratio of consecutive images 
recorded with 340 nm and with 380 nm excitation light (see Methods). Bottom: The same images after processing with the real-time software. 



presentation of the respective odour (and before the start 
of the next odour measurement): Figure 9a. 

Clustering these feature vectors (Figure 9b) shows that 
feature vectors for repeated applications of the same 
odour, e.g. nonanone, cluster together. The odourless 
control measurements (mineral oil, air, N2) appear 
clearly separated from the odourous substances, and 
chemically similar odours end up in the same cluster 
(ethyl- and methyl-3-hydroxy-methanoate). This serves 
as a proof of concept, demonstrating that the real-time 
imaging system can, in principle, both recognise known 
odours and estimate the identity of unknown odours by 
their similarity to reference odours. 

While distances between odour molecules are in part 
well reflected by response pattern distances, this is not 
always the case. For example, iso-eugenol does not fit 
into the heptyl acetate cluster, and 2-propylphenol lacks 
clear responses and therefore ends up in the (odourless) 
oil/air cluster. Further experiments are needed to evalu- 
ate whether representation of chemical identity can be 
optimised by recording more or different response units, 
e.g. in a different focal plane. 

It also needs to be tested whether the observed odour 
responses are stereotypical across many individuals. 
As a first approach, we replicated the experiment from 
Figure 9, finding a high correlation (Pearson correlation 
0.86, p =0.001, Mantel test for correlation of distance 



matrices) between the odour x odour Euclidean distance 
matrices (based on the feature vectors) from both 
experiments, indicating that the relative dissimilarity of 
odours could be conserved between individuals. 
How can the system be applied? 

Artificial chemosenors, so-called electronic noses [2-5] are 
important tools for environmental monitoring, healthcare 
or security applications. They do, however, not yet reach 
the efficiency and sensitivity of biological olfactory sys- 
tems. The real-time software can extract features from cal- 
cium imaging recordings, directly accessing the Drosophila 
antenna as a biological chemosensor. Such feature vectors 
(Figure 9a) can be used to visualise molecular identity, or 
they can be subject to further processing, e.g. by classifiers, 
aiming to determine the identity of an unknown chemical 
substance. There are two points to making the biological 
chemosensor practical: 1) Working with non-invasive bio- 
logical techniques that allow for easy handling of the flies, 
2) Software that can process the continuous stream of 
odour plumes encountered in a real-world application. 

Impact of real-time processing 

Going beyond the specific example of the chemosensor 
application, real-time processing has a wider range of 
applicability that involves any kind of interactive experi- 
mentation. This belongs to future work that is made possi- 
ble with the real-time technology. 
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Figure 9 Analysis of the Drosophila antenna recording, a) Maximum response (after odour stimulation) for response units (y-axis) and a 
series of odour stimulations (x-axis). All odours were dissolved in (odourless) mineral oil, which was also given multiple times as a control. As a 
reference, the odours nonanone and heptyl acetate were applied multiple times. Response units are sorted by response to the first nonanone 
stimulation. Odours are sorted by name. The log colour scale ranges from blue (global min) to red (global max), b) Hierarchical clustering of the 
odour feature vectors from a) using Ward's method (stats package for R) based on Euclidean distances between the feature vectors. Marked 
clusters: Odourless substances (grey), hexanoates (purple), heptyl acetate (green), nonanone (blue). 



Motivating examples 

It is increasingly clear that perception is influenced by 
both the stimulus and the prior state of the brain. For 
example, brain oscillations during the pre-stimulus 
interval influence how a human subject perceives an 
auditory stimulus in an experimental setup targeted at 
multimodal sensory integration [52]. Sensory experience 



without external stimulation, stemming only from the 
current state of the brain, is known from medical phe- 
nomena such as tinnitus [53]. 

For honeybees, there is first evidence in the direction 
that spontaneous background activity of the glomeruli in 
the AL carries information about odours that have been 
encountered recently: Glomerular activity patterns similar 
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to a particular odour response pattern reverberate minutes 
after the actual response has been elicited by odour stimu- 
lation [54]. 

Considering the growing interest in ongoing brain activ- 
ity, it is increasingly important to develop experimental 
strategies that allow stimulus presentations to be condi- 
tional on ongoing brain activity states. With the real-time 
methods presented in this work, glomeruli can be targeted 
because of their responses to odours or because they are 
part of reverberating patterns in spontaneous background 
activity. 

Real-time processing is necessary to answer funda- 
mental questions regarding the role of ongoing brain 
activity: Is it a side-effect that simply occurs as a conse- 
quence of neuron and network properties? Are patterns 
in spontaneous activity actually read out for further pro- 
cessing in the brain? In conditioning experiments [7], 
bees learn to associate an odour with a sugar reward. 
Can rewarding a pattern in spontaneous activity have 
the same effect as rewarding the actual odour? 
Added value by real-time processing 

From a biological perspective, the added value provided 
by the real-time software is that brain activity can be 
interpreted based on processed information. Only milli- 
seconds after the activity occurs, we can regard not only 
raw pixel values, but anatomically distinct and identifi- 
able units, the olfactory glomeruli in our case. 

While analysis of neural data is often is performed 
pixel-wise (or voxel-wise), the brain encodes odours in 
patterns across glomeruli. Being able to work on a glo- 
merulus level allows us to match the odour response 
patterns we observe with known response patterns from 
a database, which can reveal the chemical identity of the 
stimulus molecule. For spontaneous background activity, 
we can analyse the distribution of glomerular patterns 
that informs us about the state the antennal lobe net- 
work is in, i.e. the prior state that is relevant for how 
the stimulus will be perceived. 
How fast is fast enough? 

Closed loop experiments, where measured brain activity 
controls experimental settings, require that data proces- 
sing is faster than recording speed. In calcium imaging 
experiments, images are often recorded at frequencies of 
20 Hz or slower. Thus, any processing of 50 ms/frame or 
faster is appropriate. Recordings with voltage-sensitive 
dyes, for example, are generally useful at 50 Hz or faster: 
The fastest neuronal processes, the action potentials, 
have a duration of 1-3 ms, so recordings at 1000 Hz 
would be ideal. The current speed of 23 ms/frame (Table 
1) is already getting close to the 50 Hz value, but it is still 
far from the ideal 1000 Hz. 

For many experiments, fast processing is a requirement, 
e.g. if we wish to follow and react to fluctuations in 



spontaneous activity. For the chemosensor task, the advan- 
tage lies in the fact that we can directly query a biological 
chemosensor instead of waiting for results from post-hoc 
data analysis. Fast processing reduces the delay of the che- 
mical analysis and allows for high-throughput assays. 

Conclusions 

In the brain, odours are represented as activity patterns 
across many neurons. Calcium imaging is a technique that 
lends itself to extracting such activity patterns, as it allows 
to record many units simultaneously. So far, software for 
calcium imaging data has focussed on offline data proces- 
sing [12-15]. The algorithms and software presented in 
this work process calcium imaging movies online, making 
the neural representations of odours accessible directly 
when they occur. 

Algorithmically, we rely on a matrix factorisation that 
is updated with every new image that arrives from the 
movie stream. A low-rank approximation to the movie 
matrix serves as a compact representation of the calcium 
imaging movie, discarding noise and highlighting neural 
signals. This serves as the basis for further visualisations, 
such as functional maps of the glomeruli in the AL: Glo- 
merulus borders are not defined by anatomy, but by 
function, i.e. activity (in response to odours) of pixels 
over time. This eliminates the need for registration of 
imaging data to anatomical stainings. 

Such maps and the visualisation obtained by low-rank 
approximation reveal the "looks of an odour", the initial 
odour response pattern on the antenna, or, after data inte- 
gration and processing has taken place, the glomerular 
response pattern in the AL. 

Both odour representations have applications that 
profit from real-time processing. The role of the AL net- 
work in shaping the odour response patterns can now be 
investigated using closed-loop experiments, where prior 
system states influence current experimental parameters. 
Staining an array of receptor neurons with a single 
genetic construct, accompanied by online processing, 
provides easy access to odour response patterns, making 
real-time chemosensing with a biological sensor practical. 

Visualising the neural representation of odours serves 
also to map perceptional spaces. Distances between 
odour response patterns are an estimate for perceptional 
(dis)similarity between odours [20] throughout different 
stages of odour information processing in the same indi- 
vidual, and also between individuals and even species, 
leading to species-specific odour perception spaces. 

For such and further applications, the algorithmic and 
visualisation framework developed here enables fully 
automatic processing of odour response data without the 
need for human interaction to define e.g. regions of 
interest. 
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Availability 

Source code is available in Additional file 3. 
Additional material 



Additional file 1: Video documentation, part 1. Experimental setup for 
honeybee brain imaging. 

Additional file 2: Video documentation, part 2. Screen capture from a 
honeybee brain imaging experiment. 

Additional file 3: Source code. Archive containing source code for the 
software presented in this work. Note that TILL Photonics LA 2.0 [49] is 
required for configuring experimental hardware. 
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